Hypothesis Testing For Densities and High-Dimensional Multinomials: Sharp Local Minimax Rates

Authors

  • Sivaraman Balakrishnan
  • Larry A. Wasserman
Abstract

We consider the goodness-of-fit testing problem of distinguishing whether the data are drawn from a specified distribution, versus a composite alternative separated from the null in the total variation metric. In the discrete case, we consider goodness-of-fit testing when the null distribution has a possibly growing or unbounded number of categories. In the continuous case, we consider testing a Lipschitz density, with possibly unbounded support, in the low-smoothness regime where the Lipschitz parameter is not assumed to be constant. In contrast to existing results, we show that the minimax rate and critical testing radius in these settings depend strongly, and in a precise way, on the null distribution being tested. This motivates the study of the (local) minimax rate as a function of the null distribution. For multinomials the local minimax rate was recently studied in the work of Valiant and Valiant [30]. We revisit and extend their results and develop two modifications to the χ²-test whose performance we characterize. For testing Lipschitz densities, we show that the usual binning tests are inadequate in the low-smoothness regime and we design a spatially adaptive partitioning scheme that forms the basis for our locally minimax optimal tests. Furthermore, we provide the first local minimax lower bounds for this problem, which yield a sharp characterization of the dependence of the critical radius on the null hypothesis being tested. In the low-smoothness regime we also provide tests that adapt to the unknown smoothness parameter. We illustrate our results with a variety of simulations that demonstrate the practical utility of our proposed tests.
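To make the multinomial goodness-of-fit setting concrete, here is a minimal Python sketch of a χ²-type test of a fully specified null. The statistic is an illustrative assumption, not a form confirmed by the abstract: it subtracts the counts from the squared deviations and floors each denominator at 1/d. The function names `truncated_chi2_stat` and `monte_carlo_pvalue` are hypothetical, and calibration uses plain simulation under the null, which is possible because the null distribution is fully specified.

```python
import random
from collections import Counter


def truncated_chi2_stat(counts, p0):
    """Chi-squared-type statistic with floored denominators (illustrative assumption).

    counts: observed cell counts; p0: null cell probabilities (summing to 1).
    """
    n = sum(counts)
    d = len(p0)
    # Subtracting x removes the variance-like term of each squared deviation;
    # flooring the denominator at 1/d keeps low-probability cells from dominating.
    return sum(((x - n * p) ** 2 - x) / max(p, 1.0 / d) for x, p in zip(counts, p0))


def monte_carlo_pvalue(counts, p0, n_sim=500, seed=0):
    """Monte Carlo p-value obtained by simulating multinomial samples from the null."""
    rng = random.Random(seed)
    n = sum(counts)
    d = len(p0)
    observed = truncated_chi2_stat(counts, p0)
    exceed = 0
    for _ in range(n_sim):
        # Draw n category labels from the null and tabulate them into counts.
        draws = Counter(rng.choices(range(d), weights=p0, k=n))
        simulated = [draws.get(i, 0) for i in range(d)]
        if truncated_chi2_stat(simulated, p0) >= observed:
            exceed += 1
    # The +1 correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_sim)
```

For example, `monte_carlo_pvalue([200, 0, 0, 0], [0.25] * 4)` yields a small p-value because every observation falls in one cell, while counts close to `n * p0` give a large one. Rejection thresholds with minimax guarantees require the analysis in the paper; this sketch only illustrates the mechanics.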

Similar resources

Hypothesis Testing for High-Dimensional Multinomials: A Selective Review

The statistical analysis of discrete data has been the subject of extensive statistical research dating back to the work of Pearson. In this survey we review some recently developed methods for testing hypotheses about high-dimensional multinomials. Traditional tests like the χ²-test and the likelihood ratio test can have poor power in the high-dimensional setting. Much of the research in this a...

A New Method for Sperm Detection in Human Semen: Combination of Hypothesis Testing and Local Mapping of Wavelet Sub-Bands

Introduction: Automated methods for sperm characterization in microscopic videos have some limitations, such as low contrast of the video frames and the possibility that neighboring sperm touch each other. In this paper a new method is introduced for detection of sperm in microscopic videos. Materials and Methods: In this work, first microscopic videos are captured from specimens of human semen. S...

Optimal Calibration for Multiple Testing against Local Inhomogeneity in Higher Dimension

Based on two independent samples X1, ..., Xm and Xm+1, ..., Xn drawn from multivariate distributions with unknown Lebesgue densities p and q respectively, we propose an exact multiple test in order to identify simultaneously regions of significant deviations between p and q. The construction is built from randomized nearest-neighbor statistics. It does not require any preliminary information abou...

On the Minimax Optimality of Block Thresholded Wavelets Estimators for ?-Mixing Process

We propose a wavelet-based estimator of the regression function for a sequence of ?-mixing random variables with a common one-dimensional probability density function. Some asymptotic properties of the proposed estimator based on block thresholding are investigated. It is found that the estimators achieve optimal minimax convergence rates over large class...

Minimax testing of a composite null hypothesis defined via a quadratic functional in the model of regression

We consider the problem of testing a particular type of composite null hypothesis under a nonparametric multivariate regression model. For a given quadratic functional Q, the null hypothesis states that the regression function f satisfies the constraint Q[f] = 0, while the alternative corresponds to the functions for which Q[f] is bounded away from zero. On the one hand, we provide minimax ra...

Journal title:
  • CoRR

Volume abs/1706.10003  Issue 

Pages  -

Publication date 2017